According to the FBI, international terrorism is defined as "violent, criminal acts committed by individuals and/or groups who are inspired by, or associated with, designated foreign terrorist organizations or nations (state-sponsored)".
The purpose of this analysis is to find trends in the data and see what they tell us about global terrorist attacks: where they occur, which types of attack and which weapons are used, who the targets are, and which terrorist groups are the largest.
The variables of interest in this analysis are:
Year: Year the attack took place (1970-2017 is the range)
Country: Country the terrorist attack took place in
Region: Region the terrorist attack took place in
City: City the terrorist attack took place in
Attack Type: How the terrorist attacked the victim
Weapon Type: Weapon used by terrorist to attack the victim
Target: Who the target of this terrorist attack is
Affiliation: What terrorist group is the terrorist part of
# pip install folium
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns # advanced visualization tool
import folium # visualize lat long in map
from folium.plugins import MarkerCluster
# Input data files are available in the directory.
from subprocess import check_output
# low_memory=False reads the file in one pass, which avoids the mixed-type DtypeWarning
df=pd.read_csv('globalterrorism (1).csv', encoding="ISO-8859-1", low_memory=False)
df
| eventid | iyear | imonth | iday | approxdate | extended | resolution | country | country_txt | region | ... | addnotes | scite1 | scite2 | scite3 | dbsource | INT_LOG | INT_IDEO | INT_MISC | INT_ANY | related | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 197000000001 | 1970 | 7 | 2 | NaN | 0 | NaN | 58 | Dominican Republic | 2 | ... | NaN | NaN | NaN | NaN | PGIS | 0 | 0 | 0 | 0 | NaN |
| 1 | 197000000002 | 1970 | 0 | 0 | NaN | 0 | NaN | 130 | Mexico | 1 | ... | NaN | NaN | NaN | NaN | PGIS | 0 | 1 | 1 | 1 | NaN |
| 2 | 197001000001 | 1970 | 1 | 0 | NaN | 0 | NaN | 160 | Philippines | 5 | ... | NaN | NaN | NaN | NaN | PGIS | -9 | -9 | 1 | 1 | NaN |
| 3 | 197001000002 | 1970 | 1 | 0 | NaN | 0 | NaN | 78 | Greece | 8 | ... | NaN | NaN | NaN | NaN | PGIS | -9 | -9 | 1 | 1 | NaN |
| 4 | 197001000003 | 1970 | 1 | 0 | NaN | 0 | NaN | 101 | Japan | 4 | ... | NaN | NaN | NaN | NaN | PGIS | -9 | -9 | 1 | 1 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 181686 | 201712310022 | 2017 | 12 | 31 | NaN | 0 | NaN | 182 | Somalia | 11 | ... | NaN | "Somalia: Al-Shabaab Militants Attack Army Che... | "Highlights: Somalia Daily Media Highlights 2 ... | "Highlights: Somalia Daily Media Highlights 1 ... | START Primary Collection | 0 | 0 | 0 | 0 | NaN |
| 181687 | 201712310029 | 2017 | 12 | 31 | NaN | 0 | NaN | 200 | Syria | 10 | ... | NaN | "Putin's 'victory' in Syria has turned into a ... | "Two Russian soldiers killed at Hmeymim base i... | "Two Russian servicemen killed in Syria mortar... | START Primary Collection | -9 | -9 | 1 | 1 | NaN |
| 181688 | 201712310030 | 2017 | 12 | 31 | NaN | 0 | NaN | 160 | Philippines | 5 | ... | NaN | "Maguindanao clashes trap tribe members," Phil... | NaN | NaN | START Primary Collection | 0 | 0 | 0 | 0 | NaN |
| 181689 | 201712310031 | 2017 | 12 | 31 | NaN | 0 | NaN | 92 | India | 6 | ... | NaN | "Trader escapes grenade attack in Imphal," Bus... | NaN | NaN | START Primary Collection | -9 | -9 | 0 | -9 | NaN |
| 181690 | 201712310032 | 2017 | 12 | 31 | NaN | 0 | NaN | 160 | Philippines | 5 | ... | NaN | "Security tightened in Cotabato following IED ... | "Security tightened in Cotabato City," Manila ... | NaN | START Primary Collection | -9 | -9 | 0 | -9 | NaN |
181691 rows × 135 columns
df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 181691 entries, 0 to 181690 Columns: 135 entries, eventid to related dtypes: float64(55), int64(22), object(58) memory usage: 187.1+ MB
df.columns
Index(['eventid', 'iyear', 'imonth', 'iday', 'approxdate', 'extended',
'resolution', 'country', 'country_txt', 'region',
...
'addnotes', 'scite1', 'scite2', 'scite3', 'dbsource', 'INT_LOG',
'INT_IDEO', 'INT_MISC', 'INT_ANY', 'related'],
dtype='object', length=135)
The next cell generates descriptive statistics that summarize the central tendency, dispersion, and shape of the dataset's distribution, excluding NaN values.
df.describe()
| eventid | iyear | imonth | iday | extended | country | region | latitude | longitude | specificity | ... | ransomamt | ransomamtus | ransompaid | ransompaidus | hostkidoutcome | nreleased | INT_LOG | INT_IDEO | INT_MISC | INT_ANY | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 1.816910e+05 | 181691.000000 | 181691.000000 | 181691.000000 | 181691.000000 | 181691.000000 | 181691.000000 | 177135.000000 | 1.771340e+05 | 181685.000000 | ... | 1.350000e+03 | 5.630000e+02 | 7.740000e+02 | 552.000000 | 10991.000000 | 10400.000000 | 181691.000000 | 181691.000000 | 181691.000000 | 181691.000000 |
| mean | 2.002705e+11 | 2002.638997 | 6.467277 | 15.505644 | 0.045346 | 131.968501 | 7.160938 | 23.498343 | -4.586957e+02 | 1.451452 | ... | 3.172530e+06 | 5.784865e+05 | 7.179437e+05 | 240.378623 | 4.629242 | -29.018269 | -4.543731 | -4.464398 | 0.090010 | -3.945952 |
| std | 1.325957e+09 | 13.259430 | 3.388303 | 8.814045 | 0.208063 | 112.414535 | 2.933408 | 18.569242 | 2.047790e+05 | 0.995430 | ... | 3.021157e+07 | 7.077924e+06 | 1.014392e+07 | 2940.967293 | 2.035360 | 65.720119 | 4.543547 | 4.637152 | 0.568457 | 4.691325 |
| min | 1.970000e+11 | 1970.000000 | 0.000000 | 0.000000 | 0.000000 | 4.000000 | 1.000000 | -53.154613 | -8.618590e+07 | 1.000000 | ... | -9.900000e+01 | -9.900000e+01 | -9.900000e+01 | -99.000000 | 1.000000 | -99.000000 | -9.000000 | -9.000000 | -9.000000 | -9.000000 |
| 25% | 1.991021e+11 | 1991.000000 | 4.000000 | 8.000000 | 0.000000 | 78.000000 | 5.000000 | 11.510046 | 4.545640e+00 | 1.000000 | ... | 0.000000e+00 | 0.000000e+00 | -9.900000e+01 | 0.000000 | 2.000000 | -99.000000 | -9.000000 | -9.000000 | 0.000000 | -9.000000 |
| 50% | 2.009022e+11 | 2009.000000 | 6.000000 | 15.000000 | 0.000000 | 98.000000 | 6.000000 | 31.467463 | 4.324651e+01 | 1.000000 | ... | 1.500000e+04 | 0.000000e+00 | 0.000000e+00 | 0.000000 | 4.000000 | 0.000000 | -9.000000 | -9.000000 | 0.000000 | 0.000000 |
| 75% | 2.014081e+11 | 2014.000000 | 9.000000 | 23.000000 | 0.000000 | 160.000000 | 10.000000 | 34.685087 | 6.871033e+01 | 1.000000 | ... | 4.000000e+05 | 0.000000e+00 | 1.273412e+03 | 0.000000 | 7.000000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| max | 2.017123e+11 | 2017.000000 | 12.000000 | 31.000000 | 1.000000 | 1004.000000 | 12.000000 | 74.633553 | 1.793667e+02 | 5.000000 | ... | 1.000000e+09 | 1.320000e+08 | 2.750000e+08 | 48000.000000 | 7.000000 | 2769.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
8 rows × 77 columns
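Correlation measures how strongly two variables move together. Values near 1 indicate a directly proportional (positive) linear relationship, and values near -1 indicate an inversely proportional (negative) one. If the correlation is 0, the two variables have no linear relationship.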
df.corr()
| eventid | iyear | imonth | iday | extended | country | region | latitude | longitude | specificity | ... | ransomamt | ransomamtus | ransompaid | ransompaidus | hostkidoutcome | nreleased | INT_LOG | INT_IDEO | INT_MISC | INT_ANY | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| eventid | 1.000000 | 0.999996 | 0.002706 | 0.018336 | 0.091761 | -0.135039 | 0.401371 | 0.166886 | 0.003907 | 0.030641 | ... | -0.009990 | -0.018001 | -0.014094 | -0.165422 | 0.256113 | -0.181612 | -0.143600 | -0.133252 | -0.077852 | -0.175605 |
| iyear | 0.999996 | 1.000000 | 0.000139 | 0.018254 | 0.091754 | -0.135023 | 0.401384 | 0.166933 | 0.003917 | 0.030626 | ... | -0.009984 | -0.018216 | -0.014238 | -0.165375 | 0.256092 | -0.181556 | -0.143601 | -0.133253 | -0.077847 | -0.175596 |
| imonth | 0.002706 | 0.000139 | 1.000000 | 0.005497 | -0.000468 | -0.006305 | -0.002999 | -0.015978 | -0.003880 | 0.003621 | ... | -0.000710 | 0.046989 | 0.058878 | -0.016597 | 0.011295 | -0.011535 | -0.002302 | -0.002034 | -0.002554 | -0.006336 |
| iday | 0.018336 | 0.018254 | 0.005497 | 1.000000 | -0.004700 | 0.003468 | 0.009710 | 0.003423 | -0.002285 | -0.006991 | ... | 0.012755 | -0.010502 | 0.003148 | -0.006581 | -0.006706 | 0.001765 | -0.001540 | -0.001621 | -0.002027 | -0.001199 |
| extended | 0.091761 | 0.091754 | -0.000468 | -0.004700 | 1.000000 | -0.020466 | 0.038389 | -0.024749 | 0.000523 | 0.057897 | ... | -0.008114 | 0.028177 | 0.001966 | 0.009367 | 0.233293 | -0.192155 | 0.071768 | 0.075147 | 0.027335 | 0.080767 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| nreleased | -0.181612 | -0.181556 | -0.011535 | 0.001765 | -0.192155 | -0.044331 | -0.149511 | 0.002790 | -0.017745 | -0.030631 | ... | 0.054571 | 0.034843 | 0.049322 | 0.016832 | -0.555478 | 1.000000 | 0.039388 | 0.040947 | 0.085055 | 0.064759 |
| INT_LOG | -0.143600 | -0.143601 | -0.002302 | -0.001540 | 0.071768 | 0.069904 | -0.082584 | -0.099827 | 0.002272 | 0.073022 | ... | 0.035821 | 0.031079 | 0.007029 | -0.045504 | -0.015442 | 0.039388 | 1.000000 | 0.996211 | 0.052537 | 0.891051 |
| INT_IDEO | -0.133252 | -0.133253 | -0.002034 | -0.001621 | 0.075147 | 0.067564 | -0.071917 | -0.094470 | 0.002268 | 0.071333 | ... | 0.039053 | 0.041983 | 0.013162 | -0.039844 | -0.016234 | 0.040947 | 0.996211 | 1.000000 | 0.082014 | 0.893811 |
| INT_MISC | -0.077852 | -0.077847 | -0.002554 | -0.002027 | 0.027335 | 0.207281 | 0.043139 | 0.097652 | 0.000371 | -0.019197 | ... | 0.023815 | 0.125162 | 0.037227 | 0.129274 | -0.119776 | 0.085055 | 0.052537 | 0.082014 | 1.000000 | 0.252193 |
| INT_ANY | -0.175605 | -0.175596 | -0.006336 | -0.001199 | 0.080767 | 0.153118 | -0.047900 | -0.041530 | 0.002497 | 0.061389 | ... | 0.028054 | 0.053484 | 0.007275 | 0.056438 | -0.061946 | 0.064759 | 0.891051 | 0.893811 | 0.252193 | 1.000000 |
77 rows × 77 columns
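One caveat when reading this matrix: several columns contain negative codes such as -9 and -99 (visible in the `min` row of the `df.describe()` output), which appear to mark unknown values rather than real measurements. The sketch below uses a small made-up frame (the `toy` data is hypothetical, not taken from the dataset) to show how replacing such codes with NaN before calling `corr()` can change the result. Treating -9 and -99 as missing is an assumption that should be checked against the dataset's codebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; -99 stands in for an "unknown" code.
toy = pd.DataFrame({
    "nkill": [0, 1, 2, 3, 4],
    "nwound": [0, 2, 4, 6, 8],
    "nreleased": [-99, 5, -99, 1, 3],
})

# Correlation with the placeholder codes treated as real numbers.
raw_corr = toy.corr()

# Assumption: -9 and -99 mean "unknown" in this column, so treat them as missing.
cleaned = toy.assign(nreleased=toy["nreleased"].replace([-9, -99], np.nan))
clean_corr = cleaned.corr()

print(raw_corr.loc["nkill", "nreleased"])
print(clean_corr.loc["nkill", "nreleased"])
```

In this toy example the sign of the correlation between `nkill` and `nreleased` flips once the placeholder codes are removed, which is why the codes are worth handling before interpreting the matrix.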
df.head()
| eventid | iyear | imonth | iday | approxdate | extended | resolution | country | country_txt | region | ... | addnotes | scite1 | scite2 | scite3 | dbsource | INT_LOG | INT_IDEO | INT_MISC | INT_ANY | related | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 197000000001 | 1970 | 7 | 2 | NaN | 0 | NaN | 58 | Dominican Republic | 2 | ... | NaN | NaN | NaN | NaN | PGIS | 0 | 0 | 0 | 0 | NaN |
| 1 | 197000000002 | 1970 | 0 | 0 | NaN | 0 | NaN | 130 | Mexico | 1 | ... | NaN | NaN | NaN | NaN | PGIS | 0 | 1 | 1 | 1 | NaN |
| 2 | 197001000001 | 1970 | 1 | 0 | NaN | 0 | NaN | 160 | Philippines | 5 | ... | NaN | NaN | NaN | NaN | PGIS | -9 | -9 | 1 | 1 | NaN |
| 3 | 197001000002 | 1970 | 1 | 0 | NaN | 0 | NaN | 78 | Greece | 8 | ... | NaN | NaN | NaN | NaN | PGIS | -9 | -9 | 1 | 1 | NaN |
| 4 | 197001000003 | 1970 | 1 | 0 | NaN | 0 | NaN | 101 | Japan | 4 | ... | NaN | NaN | NaN | NaN | PGIS | -9 | -9 | 1 | 1 | NaN |
5 rows × 135 columns
df.tail()
| eventid | iyear | imonth | iday | approxdate | extended | resolution | country | country_txt | region | ... | addnotes | scite1 | scite2 | scite3 | dbsource | INT_LOG | INT_IDEO | INT_MISC | INT_ANY | related | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 181686 | 201712310022 | 2017 | 12 | 31 | NaN | 0 | NaN | 182 | Somalia | 11 | ... | NaN | "Somalia: Al-Shabaab Militants Attack Army Che... | "Highlights: Somalia Daily Media Highlights 2 ... | "Highlights: Somalia Daily Media Highlights 1 ... | START Primary Collection | 0 | 0 | 0 | 0 | NaN |
| 181687 | 201712310029 | 2017 | 12 | 31 | NaN | 0 | NaN | 200 | Syria | 10 | ... | NaN | "Putin's 'victory' in Syria has turned into a ... | "Two Russian soldiers killed at Hmeymim base i... | "Two Russian servicemen killed in Syria mortar... | START Primary Collection | -9 | -9 | 1 | 1 | NaN |
| 181688 | 201712310030 | 2017 | 12 | 31 | NaN | 0 | NaN | 160 | Philippines | 5 | ... | NaN | "Maguindanao clashes trap tribe members," Phil... | NaN | NaN | START Primary Collection | 0 | 0 | 0 | 0 | NaN |
| 181689 | 201712310031 | 2017 | 12 | 31 | NaN | 0 | NaN | 92 | India | 6 | ... | NaN | "Trader escapes grenade attack in Imphal," Bus... | NaN | NaN | START Primary Collection | -9 | -9 | 0 | -9 | NaN |
| 181690 | 201712310032 | 2017 | 12 | 31 | NaN | 0 | NaN | 160 | Philippines | 5 | ... | NaN | "Security tightened in Cotabato following IED ... | "Security tightened in Cotabato City," Manila ... | NaN | START Primary Collection | -9 | -9 | 0 | -9 | NaN |
5 rows × 135 columns
#Line Plot
#kind = type of plot, color = color, label = label, linewidth = width of line, alpha = opacity, grid = grid, linestyle = style of line
df.nkillus.plot(kind = 'line', color = 'red', label = 'The Number of Total Confirmed Fatalities for US', linewidth = 2, alpha = 0.8, grid = True,
linestyle = ':', figsize = (40,40), fontsize=15)
df.nwoundus.plot(color = "green", label = 'The Number of Confirmed Non-Fatal Injuries for US', linewidth = 2, alpha = 1, grid = True,
linestyle = '-.', figsize = (20,20), fontsize=15)
plt.legend(loc='upper right',fontsize=15) # legend = puts label into plot
plt.xlabel('Database Index', fontsize=30) # label = name of label
plt.ylabel('Number of Dead or Injuries', fontsize=30)
plt.title('Confirmed Fatalities & Non-Fatal Injuries for US',fontsize=40) #plot title
plt.show()
Because the data is sorted by date, the x-axis roughly follows time. For most of the period covered, attacks that killed or injured US citizens were rare, but they become more frequent in the later part of the index. Finding the date at which this increase starts would make it possible to relate it to the changes and developments in the country after that date.
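Since the plot uses the row index on the x-axis, the year in which the increase begins is hard to read off. A minimal sketch of one way to locate it, assuming the `iyear` and `nkillus` columns and using a small made-up frame (`toy` and its values are hypothetical): sum the fatalities per year and look for the largest year-over-year jump.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the real data has one row per attack.
toy = pd.DataFrame({
    "iyear": [1999, 1999, 2000, 2001, 2001, 2002],
    "nkillus": [0, 1, np.nan, 3000, 5, 2],
})

# Total US fatalities per year; NaN entries are skipped by sum().
yearly_us_deaths = toy.groupby("iyear")["nkillus"].sum()

# Year-over-year change; the first year has no predecessor, so it is NaN.
yearly_change = yearly_us_deaths.diff()

# Year with the largest increase over the previous year.
jump_year = yearly_change.idxmax()

print(yearly_us_deaths)
print("Largest increase in:", jump_year)
```

The same `yearly_us_deaths` Series could be plotted instead of the raw column, so the x-axis shows years rather than database positions.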
# Scatter Plot
# Generally used to compare two different features.
# Here, x = number of people killed (nkill), y = number of people wounded (nwound)
df.plot(kind = 'scatter', x = 'nkill', y = 'nwound', alpha = 0.5, color = 'red', figsize = (40,40), fontsize=40)
plt.xlabel('Kill', fontsize=30)
plt.ylabel('Wound', fontsize=30)
plt.title('Kill - Wound Scatter Plot',fontsize=40)
plt.show()
In the majority of terrorist acts, the numbers of deaths and injuries were low, but a small number of attacks caused very high numbers of deaths and injuries.
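This skew can be quantified rather than read off the scatter plot. The following sketch, on a small hypothetical `nkill` series, computes the share of attacks with at most one death and the share of all deaths caused by the single deadliest attack. The thresholds are illustrative choices, not part of the original analysis.

```python
import pandas as pd

# Hypothetical fatality counts for ten attacks.
nkill = pd.Series([0, 0, 0, 0, 0, 0, 1, 1, 2, 90], name="nkill")

# Share of attacks that killed at most one person.
low_fatality_share = (nkill <= 1).mean()

# Share of all deaths that came from the single deadliest attack.
top_attack_share = nkill.nlargest(1).sum() / nkill.sum()

print(f"Attacks with at most one death: {low_fatality_share:.0%}")
print(f"Deaths from the deadliest attack: {top_attack_share:.0%}")
```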
TERRORIST ATTACKS OF A PARTICULAR YEAR AND THEIR LOCATIONS
Let's look at the terrorist acts in the world over a certain year.
filterYear = df['iyear'] == 1970 # filter the terrorist acts
filterData = df[filterYear] # filter data
# filterData.info()
reqFilterData = filterData.loc[:,'city':'longitude'] #We are getting the required fields
reqFilterData = reqFilterData.dropna() # drop NaN values in latitude and longitude
reqFilterDataList = reqFilterData.values.tolist()
# reqFilterDataList
# map: location = camera location, zoom_start = initial zoom size, tiles = map background
# marker: location = marker location, popup = popup message(str)
map = folium.Map(location = [0, 30], tiles='CartoDB positron', zoom_start=4)
# clustered marker
markerCluster = folium.plugins.MarkerCluster().add_to(map)
for point in range(0, len(reqFilterDataList)):
    folium.Marker(location=[reqFilterDataList[point][1],reqFilterDataList[point][2]], popup = reqFilterDataList[point][0]).add_to(markerCluster)
map
84% of the terrorist attacks in 1970 were carried out in the Americas. In 1970, the Middle East and North Africa, currently a center of wars and terrorist attacks, faced only one terrorist attack.
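Regional shares like these can be computed directly instead of being estimated from the map. A minimal sketch, assuming the `iyear` and `region_txt` columns and using a hypothetical five-row frame (`toy` and `region_share_for_year` are illustrative names, not part of the notebook):

```python
import pandas as pd

# Hypothetical rows standing in for the real attack records.
toy = pd.DataFrame({
    "iyear": [1970, 1970, 1970, 1970, 1971],
    "region_txt": ["North America", "North America", "South America",
                   "Western Europe", "South Asia"],
})


def region_share_for_year(frame: pd.DataFrame, year: int) -> pd.Series:
    """Return each region's share of the attacks recorded in `year`."""
    attacks = frame.loc[frame["iyear"] == year, "region_txt"]
    return attacks.value_counts(normalize=True)


shares_1970 = region_share_for_year(toy, 1970)
print(shares_1970)
```

Calling the helper on the full DataFrame with `year=1970` would give the figures to compare against the percentages stated in the text.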
According to the records, the total number of people killed in terrorist attacks is:
killData = df.loc[:,'nkill']
print('Number of people killed by terror attack:', int(sum(killData.dropna())))# drop the NaN values
Number of people killed by terror attack: 411868
countryData = df.loc[:,'country':'country_txt']
# countryData
countryKillData = pd.concat([countryData, killData], axis=1)
# countryKillData
# pivot table sum kill values for the same country_txt
countryKillFormatData = countryKillData.pivot_table(columns='country_txt', values='nkill', aggfunc='sum')
countryKillFormatData
| country_txt | Afghanistan | Albania | Algeria | Andorra | Angola | Antigua and Barbuda | Argentina | Armenia | Australia | Austria | ... | Vietnam | Wallis and Futuna | West Bank and Gaza Strip | West Germany (FRG) | Western Sahara | Yemen | Yugoslavia | Zaire | Zambia | Zimbabwe |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| nkill | 39384.0 | 42.0 | 11066.0 | 0.0 | 3043.0 | 0.0 | 490.0 | 37.0 | 23.0 | 30.0 | ... | 1.0 | 0.0 | 1500.0 | 97.0 | 1.0 | 8776.0 | 119.0 | 324.0 | 70.0 | 154.0 |
1 rows × 205 columns
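The pivot table yields a single row with one column per country, which is awkward to sort. As a sketch of an alternative (not the approach used in this notebook), grouping by `country_txt` gives a Series that can be ranked directly. The `toy` frame and its numbers are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; nkill can be missing in the real data as well.
toy = pd.DataFrame({
    "country_txt": ["Iraq", "Iraq", "Peru", "Japan", "Peru"],
    "nkill": [10, 5, 3, np.nan, 1],
})

# Sum fatalities per country; NaN values are ignored by sum().
kills_by_country = toy.groupby("country_txt")["nkill"].sum()

# Countries with the highest totals first.
top_countries = kills_by_country.sort_values(ascending=False).head(2)
print(top_countries)
```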
countryKillFormatData.info()
<class 'pandas.core.frame.DataFrame'> Index: 1 entries, nkill to nkill Columns: 205 entries, Afghanistan to Zimbabwe dtypes: float64(205) memory usage: 1.6+ KB
I split the bar chart because the view becomes unreadable when too much data is put into a single chart. Showing about 50 countries in each plot makes everything clearer.
# fig_size used to resize the graphic
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=50
fig_size[1]=50
plt.rcParams["figure.figsize"] = fig_size
labels = countryKillFormatData.columns.tolist()
labels = labels[:50] #50 bar provides nice view
index = np.arange(len(labels))
transpoze = countryKillFormatData.T
values = transpoze.values.tolist()
values = values[:50]
values = [int(i[0]) for i in values] # convert float to int
colors = ['red', 'green', 'blue', 'purple', 'yellow', 'brown', 'black', 'gray', 'magenta', 'orange'] # color list for bar chart bar color
fig, ax = plt.subplots(1, 1)
ax.yaxis.grid(True)
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=25
fig_size[1]=25
plt.rcParams["figure.figsize"] = fig_size
plt.bar(index, values, color = colors, width = 0.9)
plt.ylabel('Killed People', fontsize=30)
plt.xticks(index, labels, fontsize=12, rotation=90)
plt.title('Number of people killed by countries',fontsize=40)
# print(fig_size)
plt.show()
labels = countryKillFormatData.columns.tolist()
labels = labels[50:101]
index = np.arange(len(labels))
transpoze = countryKillFormatData.T
values = transpoze.values.tolist()
values = values[50:101]
values = [int(i[0]) for i in values]
colors = ['red', 'green', 'blue', 'purple', 'yellow', 'brown', 'black', 'gray', 'magenta', 'orange']
fig, ax = plt.subplots(1, 1)
ax.yaxis.grid(True)
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=25
fig_size[1]=25
plt.rcParams["figure.figsize"] = fig_size
plt.bar(index, values, color = colors, width = 0.9)
plt.ylabel('Killed People', fontsize=30)
plt.xticks(index, labels, fontsize=12, rotation=90)
plt.title('Number of people killed by countries',fontsize=30)
plt.show()
labels = countryKillFormatData.columns.tolist()
labels = labels[101:152]
index = np.arange(len(labels))
transpoze = countryKillFormatData.T
values = transpoze.values.tolist()
values = values[101:152]
values = [int(i[0]) for i in values]
colors = ['red', 'green', 'blue', 'purple', 'yellow', 'brown', 'black', 'gray', 'magenta', 'orange']
fig, ax = plt.subplots(1, 1)
ax.yaxis.grid(True)
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=25
fig_size[1]=25
plt.rcParams["figure.figsize"] = fig_size
plt.bar(index, values, color = colors, width = 0.9)
plt.ylabel('Killed People', fontsize=30)
plt.xticks(index, labels, fontsize=12, rotation=90)
plt.title('Number of people killed by countries',fontsize=30)
plt.show()
labels = countryKillFormatData.columns.tolist()
labels = labels[152:206]
index = np.arange(len(labels))
transpoze = countryKillFormatData.T
values = transpoze.values.tolist()
values = values[152:206]
values = [int(i[0]) for i in values]
colors = ['red', 'green', 'blue', 'purple', 'yellow', 'brown', 'black', 'gray', 'magenta', 'orange']
fig, ax = plt.subplots(1, 1)
ax.yaxis.grid(True)
fig_size = plt.rcParams["figure.figsize"]
fig_size[0]=25
fig_size[1]=25
plt.rcParams["figure.figsize"] = fig_size
plt.bar(index, values, color = colors, width = 0.9)
plt.ylabel('Killed People', fontsize=30)
plt.xticks(index, labels, fontsize=12, rotation=90)
plt.title('Number of people killed by countries',fontsize=30)
plt.show()
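The four plotting cells above differ only in the slice of countries they draw. A hedged sketch of how the slicing could be expressed once, using a hypothetical helper `iter_chunks` and a made-up Series in place of the real per-country totals; each chunk could then be handed to the same `plt.bar` code:

```python
import pandas as pd


def iter_chunks(series: pd.Series, size: int):
    """Yield consecutive slices of `series` with at most `size` entries."""
    for start in range(0, len(series), size):
        yield series.iloc[start:start + size]


# Hypothetical stand-in for the per-country totals (7 made-up countries).
totals = pd.Series(
    [5, 0, 12, 3, 8, 1, 4],
    index=[f"Country {i}" for i in range(1, 8)],
)

chunks = list(iter_chunks(totals, size=3))
for chunk in chunks:
    # In the notebook, chunk.index would supply the tick labels and
    # chunk.values the bar heights for one figure.
    print(list(chunk.index), list(chunk.values))
```

With 205 countries and `size=50`, this would yield five chunks, the last holding five countries, instead of four hand-written slices.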
Terrorist acts in the Middle East and North Africa have had especially deadly consequences, and the region stands out as a site of serious terrorist attacks. In addition, even though there is a perception that Muslims are supporters of terrorism, Muslims are the people most harmed by terrorist attacks. The charts show that Iraq, Afghanistan, and Pakistan are the most affected countries, and all three are Muslim-majority countries.
plt.subplots(figsize=(15,6))
sns.barplot(x=df['iyear'].value_counts().index, y=df['iyear'].value_counts().values)
plt.xticks(rotation=90)
plt.xlabel('year', fontsize=20)
plt.title('Number Of Terrorist Activities Each Year',fontsize=30)
plt.show()
This bar plot shows a huge increase in terrorism after 2000 compared with 1970-2000. 2014 had the most terrorist attacks. The biggest increase in terrorist attacks occurred from 2011 to 2014, and after 2014 the number of terrorist attacks decreased.
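The peak year and the size of a multi-year increase can be read from the counts themselves. A minimal sketch with hypothetical yearly values (the numbers in `toy` are made up and do not come from the dataset):

```python
import pandas as pd

# Hypothetical data: one row per attack, only the year column is needed.
toy = pd.DataFrame({
    "iyear": [2011] * 2 + [2012] * 3 + [2013] * 4 + [2014] * 6 + [2015] * 5
})

# Attacks per year in chronological order.
attacks_per_year = toy["iyear"].value_counts().sort_index()

# Year with the most recorded attacks.
peak_year = attacks_per_year.idxmax()

# Relative growth between two chosen years.
growth_2011_2014 = attacks_per_year.loc[2014] / attacks_per_year.loc[2011] - 1

print(attacks_per_year)
print("Peak year:", peak_year)
print(f"Growth 2011 to 2014: {growth_2011_2014:.0%}")
```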
pd.crosstab(df.iyear, df.region_txt).plot(kind='area',figsize=(15,6))
plt.title('Terrorist Activities by Region in each Year',fontsize=30)
plt.ylabel('Number of Attacks',fontsize=20)
plt.show()
Here we see that South Asia, the Middle East & North Africa, Sub-Saharan Africa, and South America have the most terrorist attacks. There is also a trend that Western countries tend to have fewer terrorist attacks than developing countries. It is not surprising that these regions rank highest, which may be due to large disparities in wealth, differences in religion, and territorial disputes over oil.
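To back up the ranking of regions with numbers rather than the stacked areas alone, the same kind of crosstab can be summed per region. The sketch below assumes the `iyear` and `region_txt` columns and uses a hypothetical toy frame:

```python
import pandas as pd

# Hypothetical rows standing in for the real attack records.
toy = pd.DataFrame({
    "iyear": [2013, 2013, 2014, 2014, 2014, 2014],
    "region_txt": ["South Asia", "Western Europe", "South Asia",
                   "South Asia", "Middle East & North Africa",
                   "Middle East & North Africa"],
})

# Rows are years, columns are regions, cells are attack counts.
attacks_by_region = pd.crosstab(toy["iyear"], toy["region_txt"])

# Total attacks per region over all years, largest first.
region_totals = attacks_by_region.sum().sort_values(ascending=False)
print(region_totals)
```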
terror_region=pd.crosstab(df.iyear,df.region_txt)
terror_region.plot(color=sns.color_palette('Set2',12))
fig=plt.gcf()
fig.set_size_inches(18,6)
plt.show()
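# [1:11] skips the first entry of value_counts(), which is the 'Unknown' group, and keeps the next ten
top_groups10=df[df['gname'].isin(df['gname'].value_counts()[1:11].index)]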
pd.crosstab(top_groups10.iyear,top_groups10.gname).plot(color=sns.color_palette('Paired',10))
fig=plt.gcf()
fig.set_size_inches(18,6)
plt.show()
We can see that the top terrorist groups include ISIL, the Taliban, Shining Path, and Al-Shabaab. They are spread across several regions: ISIL is active mainly in Iraq and Syria, the Taliban in Afghanistan, Shining Path in Peru, and Al-Shabaab in Somalia. The New People's Army, which also appears among the top groups, is the armed wing of the Communist Party of the Philippines.
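The slice `[1:11]` in the cell above skips the first entry of `value_counts()`, which works only if the most frequent `gname` is the `Unknown` placeholder. A sketch of a more explicit variant, on hypothetical data, that removes `Unknown` by label before ranking:

```python
import pandas as pd

# Hypothetical group names; "Unknown" marks unattributed attacks.
toy = pd.DataFrame({
    "gname": ["Unknown", "Unknown", "Unknown", "Group A", "Group A",
              "Group B", "Group C", "Group C", "Group C"],
})

# Drop the placeholder by name instead of relying on its rank.
known = toy[toy["gname"] != "Unknown"]

# Most active named groups (top 2 here; the notebook uses the top 10).
top_groups = known["gname"].value_counts().head(2)
print(top_groups)
```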
df.city.value_counts().head(15)
Unknown         9775
Baghdad         7589
Karachi         2652
Lima            2359
Mosul           2265
Belfast         2171
Santiago        1621
Mogadishu       1581
San Salvador    1558
Istanbul        1048
Athens          1019
Bogota           984
Kirkuk           925
Beirut           918
Medellin         848
Name: city, dtype: int64
#Creating new dataframe without Unknown Category
filtered = df[df['city'] != 'Unknown']
#Barplot
plt.subplots(figsize=(15,6))
sns.barplot(x=filtered['city'].value_counts().head(15).index, y=filtered['city'].value_counts().head(15).values,
palette = "viridis")
plt.xticks(rotation=90)
plt.title('Cities With The Most Terrorist Attacks',fontsize=30)
plt.show()
From the barplot, we see that the top 5 cities for terrorist attacks are Baghdad, Karachi, Lima, Mosul, and Belfast. It is surprising to see Belfast, Northern Ireland, among the top cities, since the interactive map created earlier does not show it as one of the places with the most terrorist attacks.
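City names alone can be ambiguous, so it can help to rank cities together with their country. A minimal sketch assuming the `country_txt` and `city` columns, with a hypothetical toy frame:

```python
import pandas as pd

# Hypothetical rows standing in for the real attack records.
toy = pd.DataFrame({
    "country_txt": ["Iraq", "Iraq", "Iraq", "United Kingdom",
                    "United Kingdom", "Peru", "Iraq"],
    "city": ["Baghdad", "Baghdad", "Baghdad", "Belfast",
             "Belfast", "Lima", "Unknown"],
})

# Exclude rows whose city was not recorded.
known_cities = toy[toy["city"] != "Unknown"]

# Count attacks per (country, city) pair, largest first.
city_counts = (
    known_cities.groupby(["country_txt", "city"])
    .size()
    .sort_values(ascending=False)
)
print(city_counts.head(3))
```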
#Selecting Columns
data = df[['iyear','country_txt','city','region_txt','attacktype1_txt','weaptype1_txt','targtype1_txt', 'gname']]
data.head()
| iyear | country_txt | city | region_txt | attacktype1_txt | weaptype1_txt | targtype1_txt | gname | |
|---|---|---|---|---|---|---|---|---|
| 0 | 1970 | Dominican Republic | Santo Domingo | Central America & Caribbean | Assassination | Unknown | Private Citizens & Property | MANO-D |
| 1 | 1970 | Mexico | Mexico city | North America | Hostage Taking (Kidnapping) | Unknown | Government (Diplomatic) | 23rd of September Communist League |
| 2 | 1970 | Philippines | Unknown | Southeast Asia | Assassination | Unknown | Journalists & Media | Unknown |
| 3 | 1970 | Greece | Athens | Western Europe | Bombing/Explosion | Explosives | Government (Diplomatic) | Unknown |
| 4 | 1970 | Japan | Fukouka | East Asia | Facility/Infrastructure Attack | Incendiary | Government (Diplomatic) | Unknown |
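The selected columns correspond to the variables of interest listed at the start. As an optional sketch, they could be renamed to those friendlier labels. The mapping below is an assumption about which column matches which variable, and the single toy row is copied from the first row of the table above.

```python
import pandas as pd

# Assumed mapping from dataset column names to the variable names used in the text.
COLUMN_LABELS = {
    "iyear": "Year",
    "country_txt": "Country",
    "region_txt": "Region",
    "city": "City",
    "attacktype1_txt": "Attack Type",
    "weaptype1_txt": "Weapon Type",
    "targtype1_txt": "Target",
    "gname": "Affiliation",
}

# One-row stand-in for the real DataFrame.
toy = pd.DataFrame([{
    "iyear": 1970, "country_txt": "Dominican Republic",
    "city": "Santo Domingo", "region_txt": "Central America & Caribbean",
    "attacktype1_txt": "Assassination", "weaptype1_txt": "Unknown",
    "targtype1_txt": "Private Citizens & Property", "gname": "MANO-D",
}])

# Keep only the mapped columns, in the mapping's order, then rename them.
labeled = toy[list(COLUMN_LABELS)].rename(columns=COLUMN_LABELS)
print(labeled.columns.tolist())
```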
import plotly.express as px
#Making a new column called AttackCount which is the count of each type of terrorist attack and adding it to dataframe
# assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning
data = data.assign(AttackCount=df.attacktype1_txt.groupby(df.attacktype1_txt).transform('count'))
#Creating new Dataframe to get only Attack Type and AttackCount and dropping duplicates from Attack Type
data1 = data.copy()
data2 = data1[['attacktype1_txt','AttackCount']]
data3 = data2.drop_duplicates(keep='first')
#Pie Chart
fig = px.pie(data3, values="AttackCount",
names="attacktype1_txt",title='Terrorist Attack Types',
color_discrete_sequence=px.colors.sequential.RdBu)
fig.show()
From the pie chart, the most common type of terrorist attack is Bombing/Explosion, followed by Armed Assault and Assassination. Bombings and explosions are likely the most common for several reasons: information on how to make bombs is easily available, using a bomb or explosive is easier than attempting an armed assault or an assassination, and obtaining a bomb involves nothing like the background checks needed to get a firearm or the effort of getting close enough to assassinate someone important.
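The `AttackCount` table used for the pie chart can also be built in one step. This is a sketch of an equivalent approach on hypothetical data, not the code used above: `groupby(...).size()` counts the rows per attack type directly, without adding a column and then dropping duplicates.

```python
import pandas as pd

# Hypothetical attack records; only the attack type column is needed.
toy = pd.DataFrame({
    "attacktype1_txt": ["Bombing/Explosion", "Armed Assault",
                        "Bombing/Explosion", "Assassination",
                        "Bombing/Explosion", "Armed Assault"],
})

# One row per attack type with its number of occurrences.
attack_counts = (
    toy.groupby("attacktype1_txt")
    .size()
    .reset_index(name="AttackCount")
)
print(attack_counts)
```

The resulting frame has the same two columns, `attacktype1_txt` and `AttackCount`, that the `px.pie` call reads.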